Conditional Recurrent Neural Networks
An Application to ACKR3 Ligand Discovery
Mark James Thompson (ACLS)
2025-07-22
Aim of the Project
Problem Statement: The goal is to find novel chemical entities that bind to a receptor, using a generative deep-learning model.
BUT there are an estimated 10^60 possible drug-like candidates (< 500 Da and satisfying Lipinski's rule)!
Traditional approach: search databases by properties and test the hits.
Generative DL approach: train on known entities and their properties, then generate candidates predicted to work.
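The drug-likeness filter mentioned above can be sketched in a few lines. A minimal sketch of a Lipinski rule-of-five check, assuming the descriptors (molecular weight, logP, H-bond donors/acceptors) have already been computed, e.g. with RDKit:

```python
# Hypothetical sketch: filtering candidates by Lipinski's rule of five.
# In practice the descriptors come from RDKit (Descriptors.MolWt, etc.);
# here they are passed in directly so the idea stands alone.

def passes_lipinski(mw, logp, h_donors, h_acceptors):
    """Return True if a molecule satisfies Lipinski's rule of five."""
    return (mw <= 500               # molecular weight <= 500 Da
            and logp <= 5           # octanol-water partition coefficient <= 5
            and h_donors <= 5       # <= 5 hydrogen bond donors
            and h_acceptors <= 10)  # <= 10 hydrogen bond acceptors

# Example: a typical drug-like molecule vs. an over-heavy one
print(passes_lipinski(mw=350.0, logp=2.1, h_donors=1, h_acceptors=4))   # True
print(passes_lipinski(mw=845.9, logp=7.7, h_donors=6, h_acceptors=12))  # False
```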
The Target: ACKR3 – a G Protein-Coupled Receptor (GPCR)
- Motivation: ACKR3 plays a role in:
- angiogenesis (development of new blood vessels);
- tumor growth and metastatic processes;
- neurological conditions; etc.
![]()
Source (Yen 2022)
RNN Extensions:
Olivecrona (2017): gradient clipping
Preuer (2018): 2 CNN layers pre-pended
Kotsias (2020): Conditional with pre-pended QSAR properties
Xu (2021): Conditional Coulomb matrix of the pocket
–> I use a conditional RNN to generate the molecules
Data: Sources
- ACKR3 ligands:
- Vrije Universiteit NL (Riemens 2023);
- InterAx; and,
- papers on ACKR3.
- ZINC dataset: known clinical molecules, 10'000s of entities
- ChEMBL: Ligands known to work with GPCR protein receptors
Data: Aggregation
![]()
KNIME Workflow
Data: Relevant Features (RDKit/gnina)
- ACKR3 binding affinity
- Synthetic accessibility score
- Molecular weight
- Water–octanol partition coefficient
- Number of hydrogen bond acceptors
- Number of hydrogen bond donors
- Number of aromatic rings
Data
![]()
Distribution of mass (Da) amongst different classes of molecules
Data: Mild Correlation between Binding and pChEMBL
![]()
KNIME reports a modest but statistically significant correlation of 0.236 between the gnina docking score and pChEMBL (p < 0.001 on 575 degrees of freedom).
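The quoted significance can be sanity-checked by hand. Under the usual t-test for a Pearson correlation, t = r · sqrt(df / (1 − r²)); plugging in the reported values:

```python
import math

# Sanity check of the reported significance: t-statistic for a
# Pearson correlation, t = r * sqrt(df / (1 - r^2)).
r, df = 0.236, 575
t = r * math.sqrt(df / (1 - r**2))
print(round(t, 2))  # ~5.82, well above ~3.29, the two-sided threshold for p < 0.001
```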
Example of Inverse Agonist
![]()
Example of ACT-1004, an inverse agonist
Model: Trials and Tribulations
- Iterative process:
- VAE: encoding sparse graph information and features into the network proved error-prone
- GAN: unstable training, hardware issues while optimizing
–> The RNN tuned best out of the box, producing relatively consistent SMILES without any tuning.
Model Architecture: RNN simple, yet powerful
- Chemistry can be expressed fairly well as a sequence of elements and radicals
- The SMILES text format keeps storage and memory requirements small
- Simple logic that runs on modest hardware (Mac Pro w/ NVIDIA GPU)
- Clinical properties are integrated via the conditional RNN architecture
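One way the conditioning can be wired in (in the spirit of Kotsias 2020, with hypothetical bin edges and token names) is to discretize each target property into a token and prepend the tokens to the SMILES character sequence, so the RNN learns p(SMILES | properties):

```python
# Minimal sketch of property conditioning: discretize each desired
# property into a bin token and prepend the tokens to the SMILES
# sequence. Bin edges and token names are illustrative only.

def property_tokens(mw, logp, hbd):
    """Map target properties to conditioning tokens via coarse bins."""
    mw_bin = min(int(mw // 100), 9)            # 100 Da-wide mass bins
    logp_bin = min(max(int(logp) + 5, 0), 15)  # shift so negative logP fits
    return [f"<MW{mw_bin}>", f"<LOGP{logp_bin}>", f"<HBD{min(hbd, 9)}>"]

def conditioned_sequence(smiles, mw, logp, hbd):
    """Prepend property tokens and wrap with start/end markers."""
    return property_tokens(mw, logp, hbd) + ["^"] + list(smiles) + ["$"]

# Ethanol with its (approximate) properties as the conditioning target
seq = conditioned_sequence("CCO", mw=46.07, logp=-0.14, hbd=1)
print(seq[:4])  # ['<MW0>', '<LOGP5>', '<HBD1>', '^']
```

At sampling time the same property tokens for the *desired* profile are fed in first, and the network completes the SMILES.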
Model: Loss
- Batch size restricted by hardware limitations
- Optimizer: Adam with a default learning rate of 0.0001
- Loss function: sparse cross-entropy, which matches the encoding of each character as an integer: \[ L = - \frac{1}{T} \sum_{i=1}^{T} \ln p_{i,\,y_i} \] where \(p_{i,\,y_i}\) is the predicted probability of the true character \(y_i\) at step \(i\)
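A worked example of this loss: sparse categorical cross-entropy averages −ln p of the correct next character over the T positions of a SMILES string, with targets given as integer character ids.

```python
import math

# Worked example of the per-sequence loss: average negative log
# probability of the true next character over T sequence positions.

def sparse_cross_entropy(probs, targets):
    """probs[i]: model distribution at step i; targets[i]: true char id."""
    T = len(targets)
    return -sum(math.log(probs[i][targets[i]]) for i in range(T)) / T

# Toy 3-step sequence over a 4-character vocabulary
probs = [[0.7, 0.1, 0.1, 0.1],
         [0.2, 0.6, 0.1, 0.1],
         [0.1, 0.1, 0.1, 0.7]]
loss = sparse_cross_entropy(probs, targets=[0, 1, 3])
print(round(loss, 4))  # 0.4081 = mean of -ln(0.7), -ln(0.6), -ln(0.7)
```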
Models: Overview of RNN Models
| Model | Hidden units | Layers | Dropout | Training set | Notes |
|---|---|---|---|---|---|
| Bjerrum (2017a) | 256 | 2 | 0.1 | 1.6m, 13.2m | 2 more FF NN layers post-pended |
| Bjerrum (2017b) | 64 | 1 | 0.0 | 673 (89k) | (enumerated, cf. infra) |
| Gupta (2017) | 256 | 2 | yes | 542k | lengths 34 to 74 |
| Olivecrona (2017) | 1024 | 3 | ? | 1.5m | gradient clipping, agent optimizer |
| Segler (2018) | 1024 | 3 | 0.2 + d.o. layers | 1.4m | gradient clipping |
| Preuer (2018) | ? | 2 | ? | 200k | 2 CNN + maxpool pre-pended |
| Polykovskiy (2020) | 768 | 3 | 0.2 | 1.76m | CharRNN model |
| Kotsias (2020) | 256 | 3 | ? | 1.34m | 6 dense layers pre-pended for properties |
| Grisoni (2020) | 512 | 2 | 0.3 | 272k | |
| Xu (2021) | 512 | 2 | 0.3 | 194k | 2 dense layers pre-pended for Coulomb matrix |
Results: Accuracy
- Out-of-sample validity varies widely; the weakest configurations produce mostly invalid SMILES (up to 97.5% invalid)
- Accuracy gains are small after 64+ epochs
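The validity numbers come from a full SMILES parser (e.g. RDKit's `Chem.MolFromSmiles`); as a purely illustrative sketch, the most common RNN output errors can already be caught by a cheap structural check for balanced branches and paired ring-closure digits:

```python
# Cheap structural sanity check for generated SMILES. Not a real
# validity test (RDKit does that), but it catches the most common
# RNN failure modes: unbalanced branches and dangling ring closures.

def roughly_valid(smiles):
    depth = 0      # current '(' nesting depth
    rings = set()  # ring-closure digits seen an odd number of times
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # ')' without matching '('
        elif ch.isdigit():
            rings ^= {ch}     # toggle: each ring digit must appear twice
    return depth == 0 and not rings

print(roughly_valid("c1ccccc1"))  # True: benzene
print(roughly_valid("CC(C"))      # False: unclosed branch
print(roughly_valid("c1ccccc"))   # False: unmatched ring closure
```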
Results: Validity
Overview of RNN attributes.
| Model | Validity (%) | Uniqueness (%) | Novelty (%) | | | | |
|---|---|---|---|---|---|---|---|
| Real | 100.0 | 100 | 0.0 | 0.646 | 3.07 | 3.15 | 0.877 |
| Polykovskiy | 64.6 | 100 | 100.0 | 0.772 | 2.06 | 3.21 | 0.798 |
| Bjerrum | 72.7 | 100 | 100.0 | 0.782 | 2.42 | 3.52 | 0.838 |
| Gupta | 75.0 | 100 | 100.0 | 0.801 | 2.28 | 3.60 | 0.826 |
| Kotsias | 2.1 | 100 | 100.0 | 0.061 | 2.91 | 6.38 | NA |
| Segler | 67.3 | 100 | 100.0 | 0.802 | 2.16 | 3.31 | 0.821 |
| Olivecrona | 0.6 | 100 | 100.0 | 0.095 | 3.38 | 13.00 | 0.715 |
| Xu/Grisoni | 2.7 | 100 | 100.0 | 0.092 | 2.93 | 19.90 | 0.539 |
| 64h512_2m | 79.2 | 100 | 100.0 | 0.774 | 2.26 | 3.33 | 0.832 |
| 64h512_4m | 75.5 | 100 | 99.8 | 0.750 | 2.31 | 3.61 | 0.836 |
| 64h512_8m | 78.6 | 100 | 100.0 | 0.776 | 2.25 | 3.19 | 0.833 |
| 64h1024_4m | 77.1 | 100 | 100.0 | 0.790 | 1.99 | 3.33 | 0.789 |
Results: De Novo Generation Shifts the Distribution Higher
Results: Example DeNovo Molecules
![Molecular view of candidate ligand B. Note the benzene ring and the nitrogen that we saw in the known agonist.]()
Results: de Novo Molecules Resemble the Training Distribution
![]()
t-SNE
Results: An (un)known SMILES
- String: “Cc1cc(CN)n(n1)[C@H]1CC@@Hc1cc2cc(C)ccc2n1Cc1ccc(Cl)cc1”
- Received a gnina CNN ligand-ACKR3 binding score of 0.961
![]()
Nicotinamide N-methyltransferase (NNMT)
Boltz-2 Simulation: Baseline
![]()
The inverse agonist ACT-1004-1239
Boltz-2 Simulation: Potential Candidate
![]()
Candidate inverse agonist A
Boltz-2 Simulation: Bad Candidate
![]()
Candidate D penetrates the transmembrane helices: both the docking pose and the molecule itself are "bad". A long-chain polymer (orange) threads through the transmembrane helices, so this candidate is likely spurious or would disrupt the GPCR.
Discussion: Challenges
- Low-level hardware issues galore: GRU activation function, GPU libraries, etc.
- Scarce compute resources: generating the binding scores has taken 7+ months of compute time; thousands of files, slow cycle times
- Large IT setup and pre-work: Python versioning and environments, undocumented tools
- The hardest issue encountered: validating the de novo ligands
Discussion: Future Directions & Ideas
- Add topological information about the molecule via graph or structural fingerprints
- A larger dataset of all GPCR ligands would be useful, including those already docked in a complex
- An interesting idea would be to feed simulation input into the model through a CNN
Conclusion
- Achievements:
- Scored ~97’000 known candidate ligands for their potential to dock in complex with ACKR3 using gnina’s CNN docking function
- Used binding score to find ~200 novel candidate ligands for ACKR3 obtaining some therapeutic compound leads
- Key Takeaway: Demonstrates the potential of conditional RNNs for developing candidate molecules tailored to a specific domain.
Thank You & Questions?
- Thanks to Tomek at InterAx Biotech for showing me MD and helping me with the biology
- Thanks Manuel Dömer for putting up with my inner conflict
- Ready for questions.
Appendix: F-Tree Similarity Search
![]()
F-Tree similarity search on VUF-0016840 using infiniSee from BioSolveIT finds many similar matches in a huge library (billions of compounds). However, such hits may be too similar to the query molecule, in contrast to generative techniques.
Appendix: Feature Summary Statistics
Molecular Weight (Da) Summary Statistic on Scored Data

| Class | N | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| GLASS | 562 | 463.0917 | 89.21046 | 343.22598 | 2050.0366 |
| GPCR_ligand | 21365 | 422.7739 | 122.97633 | 75.03203 | 2065.0475 |
| ZINC_molecule | 66287 | 349.0975 | 49.96812 | 202.10275 | 495.2169 |
| deNovo | 2762 | 845.8805 | 384.62142 | 159.14919 | 1255.8751 |
| interax | 359 | 405.6960 | 71.15851 | 164.10620 | 600.3788 |
Appendix: Feature Summary Statistics
sLogP Summary Statistic on Scored Data
| Class | N | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| GLASS | 562 | 4.374149 | 1.2522874 | -6.26022 | 6.8349 |
| GPCR_ligand | 21365 | 4.040989 | 1.4639035 | -9.50710 | 10.8638 |
| ZINC_molecule | 66287 | 3.038343 | 0.9987149 | -0.13210 | 4.9280 |
| deNovo | 2762 | 7.728784 | 6.4347880 | -8.60250 | 34.9649 |
| interax | 359 | 3.870362 | 0.9736113 | 0.42854 | 6.7920 |
Appendix: Feature Summary Statistics
gnina Binding Score Summary Statistic on Scored Data
| Class | N | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| GLASS | 562 | 0.6859925 | 0.0957217 | 0.0548102 | 0.9527524 |
| GPCR_ligand | 21365 | 0.7133912 | 0.1457480 | 0.0142750 | 0.9837449 |
| ZINC_molecule | 66287 | 0.7668422 | 0.1242323 | 0.0208747 | 0.9906685 |
| deNovo | 2762 | 0.7669212 | 0.2231383 | 0.0930477 | 0.9640365 |
| interax | 359 | 0.7679811 | 0.1044172 | 0.4975969 | 0.9684185 |
Appendix: Feature Summary Statistics
Rotatable Bonds Summary Statistic on Scored Data
| Class | N | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| GLASS | 562 | 6.496441 | 2.171971 | 2 | 45 |
| GPCR_ligand | 21365 | 5.559373 | 3.396171 | 0 | 48 |
| ZINC_molecule | 66287 | 4.684932 | 1.600661 | 1 | 8 |
| deNovo | 2762 | 48.868573 | 32.237210 | 0 | 86 |
| interax | 359 | 5.688022 | 2.245454 | 0 | 13 |
Appendix: Improving the RNN for SMILES Chemistry
We experimented with replacing digraphs and trigraphs with single characters, e.g.:
- "As": "🜺", # alchemical symbol for Arsenic
- "Ag": "☽", # alchemical symbol for Silver (Moon)
- "Na": "钠", # Chinese character for Sodium, pronounced "nà"
- "Be": "铍", # Chinese character for Beryllium
- "Bi": "♆", # alchemical symbol for Bismuth
- …
Gains were small to nil: the longer the combination, the fewer occurrences there are to replace.
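The mechanics of the experiment look roughly like this (a sketch with an illustrative ASCII mapping rather than the alchemical symbols above): rewrite two-letter element symbols as single characters so every token is one character long, then invert the mapping after generation.

```python
# Sketch of the digraph-to-singleton experiment: map two-letter element
# symbols to single placeholder characters (chosen so they never occur
# in ordinary SMILES) and invert the mapping after generation.

SINGLETONS = {"Cl": "L", "Br": "R", "Si": "X"}  # illustrative subset

def encode(smiles):
    """Replace each two-letter symbol with its single-character stand-in."""
    for pair, one in SINGLETONS.items():
        smiles = smiles.replace(pair, one)
    return smiles

def decode(text):
    """Invert the mapping to recover standard SMILES."""
    for pair, one in SINGLETONS.items():
        text = text.replace(one, pair)
    return text

s = "CC(Cl)c1ccc(Br)cc1"
print(encode(s))            # CC(L)c1ccc(R)cc1
print(decode(encode(s)) == s)  # True: the round trip is lossless
```

With every token one character wide, the RNN never has to learn that "C" followed by "l" means chlorine rather than carbon plus aromatic... but, as noted, the measured gains were marginal.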